convolutional network
Material
In the supplementary material, A.1 provides additional details: an introduction to the data, key parameter settings, comparisons with baselines, optimization methods, and the algorithm of our method. A.2 presents supplementary experiments for our model, including visualization experiments and replication studies. A.3 discusses the reasons for using hypergraphs as the temporal encoder. Finally, A.4 discusses the limitations and broader impacts of our work. A.1 Data and Implementation Details. Data. The statistical information of the four aforementioned real-world datasets is presented in Table 4.
From Trainable Negative Depth to Edge Heterophily in Graphs
Finding the proper depth d of a graph convolutional network (GCN) that provides strong representation ability has drawn significant attention, yet largely remains an open problem for the graph learning community. Although noteworthy progress has been made, the depth, or number of layers, of a GCN is realized by a series of graph convolution operations, which naturally makes d a positive integer (d ∈ ℕ+). An interesting question is whether breaking the constraint of ℕ+ by making d a real number (d ∈ ℝ) can bring new insights into graph learning mechanisms. In this work, by redefining GCN's depth d as a trainable parameter continuously adjustable within (−∞, +∞), we open a new door of controlling its signal processing capability to model graph homophily/heterophily (nodes with similar/dissimilar labels/attributes tend to be inter-connected). A simple and powerful GCN model, TEDGCN, is proposed to retain the simplicity of GCN while automatically searching for the optimal d without prior knowledge of whether the input graph is homophilic or heterophilic. Negative-valued d intrinsically enables high-pass frequency filtering via augmented topology for graph heterophily. Extensive experiments demonstrate the superiority of TEDGCN on node classification tasks for a variety of homophilic and heterophilic graphs.
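To make the idea of a real-valued depth concrete, here is a minimal sketch of how a propagation matrix could be raised to an arbitrary real power d through its eigendecomposition. It is an illustration under stated assumptions, not the TEDGCN implementation: the propagation matrix S = (I + D^(-1/2) (A + I) D^(-1/2)) / 2, the function name real_depth_filter, and the toy path graph are all hypothetical choices made here so that the eigenvalues of S lie in (0, 1] and every real power is well defined.

```python
import numpy as np

def real_depth_filter(adj, d):
    """Return S**d for a real-valued depth d (illustrative assumption, not TEDGCN itself).

    S = (I + D^-1/2 (A + I) D^-1/2) / 2 is symmetric with eigenvalues in (0, 1],
    so S**d = U diag(lam**d) U^T is defined for any real d, including d < 0.
    """
    n = adj.shape[0]
    a_tilde = adj + np.eye(n)            # self-loops keep the eigenvalues of S strictly positive
    deg = a_tilde.sum(axis=1)
    d_inv_sqrt = np.diag(deg ** -0.5)
    s = 0.5 * (np.eye(n) + d_inv_sqrt @ a_tilde @ d_inv_sqrt)
    lam, u = np.linalg.eigh(s)           # S is symmetric, so eigh applies
    return (u * lam ** d) @ u.T          # scale each eigenvector by lam**d

# Toy 4-node path graph and an alternating (high-frequency) node signal.
path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]], dtype=float)
x = np.array([1.0, -1.0, 1.0, -1.0])

low = real_depth_filter(path, 1.0) @ x    # positive depth: attenuates the alternating signal
high = real_depth_filter(path, -1.0) @ x  # negative depth: amplifies it
print(np.linalg.norm(low), np.linalg.norm(x), np.linalg.norm(high))
```

Because every eigenvalue of this S is at most 1, a positive d shrinks the high-frequency components (low-pass behavior), while a negative d inverts the spectrum and boosts them. In a trainable setting, d would be a learnable scalar rather than a fixed argument.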
Locality defeats the curse of dimensionality in convolutional teacher-student scenarios
Convolutional neural networks perform a local and translationally-invariant treatment of the data: quantifying which of these two aspects is central to their success remains a challenge. We study this problem within a teacher-student framework for kernel regression, using 'convolutional' kernels inspired by the neural tangent kernel of simple convolutional architectures of given filter size. Using heuristic methods from physics, we find in the ridgeless case that locality is key in determining the learning curve exponent β (that relates the test error ε_t ∼ P^(−β) to the size of the training set P), whereas translational invariance is not. In particular, if the filter size of the teacher t is smaller than that of the student s, β is a function of s only and does not depend on the input dimension. We confirm our predictions on β empirically. We conclude by proving, under a natural universality assumption, that performing kernel regression with a ridge that decreases with the size of the training set leads to similar learning curve exponents to those we obtain in the ridgeless case.
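The learning curve exponent β in ε_t ∼ P^(−β) is typically read off as the negative slope of the test error against the training set size on a log-log scale. The sketch below shows one common way to estimate it with an ordinary least-squares fit. The function name fit_learning_curve_exponent and the synthetic data are assumptions made for illustration; the numbers are generated from an exact power law and are not results from the paper.

```python
import math

def fit_learning_curve_exponent(train_sizes, test_errors):
    """Estimate beta in error ~ P**(-beta) via least squares on log(error) vs log(P)."""
    xs = [math.log(p) for p in train_sizes]
    ys = [math.log(e) for e in test_errors]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    var = sum((x - x_mean) ** 2 for x in xs)
    return -cov / var                    # beta is minus the log-log slope

# Synthetic, noise-free example: errors follow 3 * P**(-0.5) by construction.
sizes = [100, 200, 400, 800]
errors = [3.0 * p ** -0.5 for p in sizes]
beta = fit_learning_curve_exponent(sizes, errors)
print(beta)
```

With measured test errors, the fit is usually restricted to the large-P regime, where the power law is expected to hold.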
Efficient Equivariant Network
Convolutional neural networks (CNNs) have dominated the field of Computer Vision and achieved great success due to their built-in translation equivariance. Group equivariant CNNs (G-CNNs) that incorporate more equivariance can significantly improve the performance of conventional CNNs. However, G-CNNs face two major challenges: the spatial-agnostic problem and expensive computational cost. In this work, we propose a general framework of previous equivariant models, which includes G-CNNs and equivariant self-attention layers as special cases.
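The translation equivariance mentioned above can be checked directly on a toy example: shifting the input and then convolving gives the same result as convolving and then shifting. The sketch below uses a 1D circular cross-correlation on plain Python lists. The helper names circular_conv and shift, and the sample signal and kernel, are hypothetical and chosen only to illustrate the property; this is not the framework proposed in the paper.

```python
def circular_conv(signal, kernel):
    """1D circular cross-correlation: out[i] = sum_j kernel[j] * signal[(i + j) % n]."""
    n = len(signal)
    return [sum(kernel[j] * signal[(i + j) % n] for j in range(len(kernel)))
            for i in range(n)]

def shift(signal, k):
    """Cyclically shift a list to the right by k positions."""
    k %= len(signal)
    return signal[-k:] + signal[:-k]

x = [1, 4, 2, 8, 5, 7]   # toy input signal
w = [1, -2, 1]           # toy filter

# Equivariance: convolving a shifted input equals shifting the convolved output.
for k in range(len(x)):
    assert circular_conv(shift(x, k), w) == shift(circular_conv(x, w), k)
print("translation equivariance holds for all cyclic shifts")
```

Group equivariant layers extend the same commutation property from translations to a larger group of transformations, such as rotations and reflections.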